Smart Paper Notes

An Interactive Automation for Human Biliary Tree Diagnosis Using Computer Vision

Mohammad AL-Oudat , Saleh Alomari , Hazem Qattous , Mohammad Azzeh , Tariq AL-Munaizel

Categories: Computer Vision | Machine Learning

2022-09-10

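The biliary tree is a network of tubes that connects the liver to the gallbladder, an organ located right beneath it. The bile duct is the major tube in the biliary tree. Dilatation of the bile duct is a key indicator of more serious problems in the human body, such as stones and tumors, which are frequently caused by the pancreas or the papilla of Vater. In many cases, detecting bile duct dilatation can be challenging for beginner or untrained medical personnel. Even professionals cannot detect bile duct dilatation with the naked eye alone. This research presents a unique vision-based model for initial diagnosis. To segment the biliary tree from magnetic resonance images (MRI), the framework uses several image processing approaches. After the region of interest has been segmented, a number of calculations are performed on it to extract 10 features, including the major and minor axes, bile duct area, biliary tree area, compactness, and some textural features (contrast, mean, variance, and correlation). The study used an image database from King Hussein Medical Center in Amman, Jordan, comprising 200 MRI images: 100 normal cases and 100 patients with dilated bile ducts. After the features are extracted, various classifiers are used to determine the patient's health status (normal or dilated). The findings show that the extracted features perform well with all classifiers in terms of accuracy and area under the curve. The study is unique in that it uses an automated approach to segment the biliary tree from MRI images and scientifically correlates the retrieved features with biliary tree status, which has not been done before in the literature.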
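The shape features named in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract gives no formulas, so the definitions here are assumptions (compactness as 4πA/P², axis lengths derived from the second central moments of the region, and perimeter counted as exposed pixel edges). The function name `shape_features` is hypothetical, and the texture features are omitted.

```python
import math


def shape_features(mask):
    """Compute simple shape descriptors for a binary mask (list of rows of 0/1)."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    filled = set(pixels)

    # Perimeter: count pixel edges that border background or the image edge.
    perimeter = sum(
        (x + dx, y + dy) not in filled
        for x, y in pixels
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )

    # Second central moments of the foreground pixel coordinates.
    mx = sum(x for x, _ in pixels) / area
    my = sum(y for _, y in pixels) / area
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / area
    syy = sum((y - my) ** 2 for _, y in pixels) / area
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / area

    # Eigenvalues of the 2x2 covariance matrix give the spread along each axis.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major = 4 * math.sqrt(half_trace + spread)  # assumed axis-length convention
    minor = 4 * math.sqrt(max(half_trace - spread, 0.0))

    return {
        "area": area,
        "perimeter": perimeter,
        "compactness": 4 * math.pi * area / perimeter ** 2,  # assumed definition
        "major_axis": major,
        "minor_axis": minor,
    }
```

For a segmented region stored as a list of 0/1 rows, `shape_features(mask)` returns a dictionary whose values could be fed to any off-the-shelf classifier alongside texture statistics.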


Data Augmentation using Transformers and Similarity Measures for Improving Arabic Text Classification

Dania Refai , Saleh Abo-Soud , Mohammad Abdel-Rahman

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-28

Learning models depend heavily on data to work effectively, and they perform better when trained on large datasets. A large body of research in the literature addresses the dataset adequacy issue. One promising approach is data augmentation (DA). In DA, the number of training instances is increased by applying different transformations to the available instances to generate new, correct, and representative ones. DA increases the size and variability of the dataset, which enhances model performance and prediction accuracy. DA also addresses the class imbalance problem in classification. Few studies have so far considered DA for the Arabic language. These studies rely on traditional augmentation approaches, such as rule-based paraphrasing or noising-based techniques. In this paper, we propose a new Arabic DA method that employs a recent powerful modeling technique, AraGPT-2, for the augmentation process. The generated sentences are evaluated in terms of context, semantics, diversity, and novelty using the Euclidean, cosine, Jaccard, and BLEU measures. Finally, the AraBERT transformer is used on sentiment classification tasks to evaluate the classification performance of the augmented Arabic dataset. The experiments were conducted on four Arabic sentiment datasets: AraSarcasm, ASTD, ATT, and MOVIE. The selected datasets vary in size, number of labels, and class imbalance. The results show that the proposed methodology improved Arabic sentiment text classification on all datasets, increasing the F1 score by 4% on AraSarcasm, 6% on ASTD, 9% on ATT, and 13% on MOVIE.
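Three of the measures named above (Euclidean, cosine, and Jaccard) can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say how sentences are vectorized, so the sketch uses whitespace tokens and bag-of-words counts, and the helper names (`tokenize`, `bag_of_words`, `compare`) are hypothetical. BLEU is left out.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Split on whitespace; a real Arabic pipeline would normalize text first."""
    return sentence.split()


def bag_of_words(a_tokens, b_tokens):
    """Build aligned term-count vectors over the union vocabulary."""
    vocab = sorted(set(a_tokens) | set(b_tokens))
    a_counts, b_counts = Counter(a_tokens), Counter(b_tokens)
    return [a_counts[w] for w in vocab], [b_counts[w] for w in vocab]


def euclidean_distance(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def jaccard_similarity(a_tokens, b_tokens):
    a_set, b_set = set(a_tokens), set(b_tokens)
    union = a_set | b_set
    return len(a_set & b_set) / len(union) if union else 1.0


def compare(original, generated):
    """Score a generated sentence against the sentence it was derived from."""
    a, b = tokenize(original), tokenize(generated)
    u, v = bag_of_words(a, b)
    return {
        "euclidean": euclidean_distance(u, v),
        "cosine": cosine_similarity(u, v),
        "jaccard": jaccard_similarity(a, b),
    }
```

A generated sentence that scores close to its source on every measure adds little novelty, while one that scores very far from it may have drifted off-topic. A filter built on these scores would therefore likely keep candidates in a middle band.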


Data-driven control of COVID-19 in buildings: a reinforcement-learning approach

Ashkan Haji Hosseinloo , Saleh Nabi , Anette Hosoi , Munther A. Dahleh

Categories: Artificial Intelligence | Machine Learning

2022-12-27

In addition to its public health crisis, the COVID-19 pandemic has led to the shutdown and closure of workplaces, with an estimated total cost of more than $16 trillion. Given the long hours an average person spends in buildings and indoor environments, this research article proposes data-driven control strategies to design optimal indoor airflow that minimizes the exposure of occupants to viral pathogens in built environments. A general control framework is put forward for designing an optimal velocity field, and proximal policy optimization, a reinforcement learning algorithm, is employed to solve the control problem in a data-driven fashion. The same framework is used for the optimal placement of disinfectants to neutralize the viral pathogens, as an alternative to airflow design when the latter is practically infeasible or hard to implement. We show, via simulation experiments, that the control agent learns the optimal policy in both scenarios within a reasonable time. The proposed data-driven control framework will have significant societal and economic benefits by laying the foundation for an improved methodology for designing case-specific infection control guidelines that can be realized with affordable ventilation devices and disinfectants.


Improving the Robustness of Summarization Models by Detecting and Removing Input Noise

Kundan Krishna , Yao Zhao , Jie Ren , Balaji Lakshminarayanan , Jiaming Luo , Mohammad Saleh , Peter J. Liu

Categories: Natural Language Processing | Machine Learning

2022-12-20

The evaluation of abstractive summarization models typically uses test data drawn from the same distribution as the training data. In real-world practice, documents to be summarized may contain input noise caused by text extraction artifacts or data pipeline bugs. The robustness of model performance under the distribution shift caused by such noise is relatively under-studied. We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes. We then propose a lightweight method for detecting and removing such noise in the input during model inference without requiring any extra training, auxiliary models, or even prior knowledge of the type of noise. Our proposed approach effectively mitigates the loss in performance, recovering a large fraction of the performance drop, sometimes as much as 11 ROUGE-1 points.


Adaptive Uncertainty Distribution in Deep Learning for Unsupervised Underwater Image Enhancement

Alzayat Saleh , Marcus Sheaves , Dean Jerry , Mostafa Rahimi Azghadi

Categories: Computer Vision

2022-12-18

One of the main challenges in deep learning-based underwater image enhancement is the limited availability of high-quality training data. Underwater images are difficult to capture and are often of poor quality due to the distortion and loss of colour and contrast in water. This makes it difficult to train supervised deep learning models on large and diverse datasets, which can limit the model's performance. In this paper, we explore an alternative approach to supervised underwater image enhancement. Specifically, we propose a novel unsupervised underwater image enhancement framework that employs a conditional variational autoencoder (cVAE) to train a deep learning model with probabilistic adaptive instance normalization (PAdaIN) and statistically guided multi-colour space stretch that produces realistic underwater images. The resulting framework is composed of a U-Net as a feature extractor and a PAdaIN to encode the uncertainty, which we call UDnet. To improve the visual quality of the images generated by UDnet, we use a statistically guided multi-colour space stretch module that ensures visual consistency with the input image and provides an alternative to training using a ground truth image. The proposed model does not need manual human annotation and can learn with a limited amount of data and achieves state-of-the-art results on underwater images. We evaluated our proposed framework on eight publicly-available datasets. The results show that our proposed framework yields competitive performance compared to other state-of-the-art approaches in quantitative as well as qualitative metrics. Code available at https://github.com/alzayats/UDnet .


RIGA: Rotation-Invariant and Globally-Aware Descriptors for Point Cloud Registration

Hao Yu , Ji Hou , Zheng Qin , Mahdi Saleh , Ivan Shugurov , Kai Wang , Benjamin Busam , Slobodan Ilic

Categories: Computer Vision

2022-09-27

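Successful point cloud registration relies on accurate correspondences established upon powerful descriptors. However, existing neural descriptors either leverage a rotation-variant backbone whose performance declines under large rotations, or encode local geometry that is less distinctive. To address this issue, we introduce RIGA to learn descriptors that are rotation-invariant by design and globally aware. From the Point Pair Features (PPFs) of sparse local regions, rotation-invariant local geometry is encoded into geometric descriptors. Global awareness of 3D structures and of geometric context is subsequently incorporated, both in a rotation-invariant fashion. More specifically, the 3D structures of the whole frame are first represented by our global PPF signatures, from which structural descriptors are learned to help geometric descriptors sense the 3D world beyond local regions. The geometric context of the whole scene is then globally aggregated into the descriptors. Finally, the descriptions of sparse regions are interpolated to dense point descriptors, from which correspondences are extracted for registration. To validate our approach, we conduct extensive experiments on both object-level and scene-level data. Under large rotations, RIGA surpasses state-of-the-art methods by a margin of 8° in Relative Rotation Error on ModelNet40 and improves Feature Matching Recall by at least 5 percentage points on 3DLoMatch.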
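A short sketch can show why Point Pair Features are a natural starting point for rotation-invariant descriptors. It uses the classical four-component PPF (the distance between two points and three angles involving their normals), which is an assumption here: the abstract does not spell out the exact encoding RIGA uses, and the learned descriptors are not reproduced. The helper names are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def angle(a, b):
    """Angle in radians between two non-zero 3D vectors."""
    c = dot(a, b) / (norm(a) * norm(b))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error


def point_pair_feature(p1, n1, p2, n2):
    """Classical PPF: (|d|, angle(n1, d), angle(n2, d), angle(n1, n2)), d = p2 - p1."""
    d = sub(p2, p1)
    return (norm(d), angle(n1, d), angle(n2, d), angle(n1, n2))


def rotate_z(v, theta):
    """Rotate a 3D vector about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)


if __name__ == "__main__":
    p1, n1 = (0.2, -1.0, 0.5), (0.0, 0.0, 1.0)
    p2, n2 = (1.5, 0.3, -0.7), (1.0, 2.0, 2.0)
    before = point_pair_feature(p1, n1, p2, n2)
    # Apply the same rotation to both points and both normals.
    after = point_pair_feature(*(rotate_z(v, 1.0) for v in (p1, n1, p2, n2)))
    print(all(abs(a - b) < 1e-9 for a, b in zip(before, after)))
```

Because the feature is built only from a length and from angles between vectors, rotating the whole point cloud leaves it unchanged up to floating-point error, which is the property the comparison in the `__main__` block checks.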


Traffic Accident Risk Forecasting using Contextual Vision Transformers

Khaled Saleh , Artur Grigorev , Adriana-Simona Mihaita

Categories: Computer Vision | Artificial Intelligence

2022-09-20

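Recently, the problem of traffic accident risk forecasting has been attracting the attention of the intelligent transportation systems community because of its significant impact on traffic clearance. In the literature, this problem is commonly tackled with data-driven approaches that model the spatial and temporal impact of incidents, since both have been shown to be crucial for traffic accident risk forecasting. To achieve this, most approaches build separate architectures to capture spatio-temporal correlation features, which makes them inefficient on large traffic accident datasets. In this work, we therefore propose a novel unified framework, a contextual vision transformer, that can be trained end to end and can effectively reason about the spatial and temporal aspects of the problem while providing accurate traffic accident risk predictions. We evaluate and compare the performance of our proposed methodology against baseline approaches from the literature on two large-scale traffic accident datasets from two different geographical locations. The results show a significant improvement of roughly 2% in RMSE score compared with previous state-of-the-art (SoTA) works in the literature. Moreover, our proposed approach outperforms the SoTA technique on both datasets while requiring 23 times less computation.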


LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

Andy Rosenbaum , Saleh Soltan , Wael Hamza , Yannick Versley , Markus Boese

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-09-20

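We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST) by fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt. In a 10-shot novel-intent setting on the SNIPS dataset, LINGUIST surpasses state-of-the-art approaches (Back-Translation and Example Extrapolation) by a wide margin, showing absolute improvements for the target intents of +1.9 points on IC Recall and +2.5 points on ST F1 Score. In the zero-shot cross-lingual setting of the mATIS++ dataset, LINGUIST outperforms a strong baseline of Machine Translation with Slot Alignment by +4.14 points absolute on ST F1 Score across 6 languages, while matching its performance on IC. Finally, we verify our results on an internal large-scale multilingual dataset for conversational agent IC+ST and show significant improvements over a baseline that uses Back-Translation, Paraphrasing, and Slot Catalog Resampling. To our knowledge, we are the first to demonstrate instruction fine-tuning of a large-scale seq2seq model to control the outputs of multilingual intent- and slot-labeled data generation.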


Traffic incident duration prediction via a deep learning framework for text description encoding

Artur Grigorev , Adriana-Simona Mihaita , Khaled Saleh , Massimo Piccardi

Categories: Machine Learning

2022-09-19

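Predicting traffic incident duration is a hard problem because of the stochastic nature of incident occurrence in space and time, the lack of information at the beginning of a reported traffic disruption, and the lack of advanced methods in transport engineering for deriving insights from past accidents. This paper proposes a new fusion framework for predicting incident duration from limited information. It integrates machine learning with traffic flow/speed and incident descriptions as features, the latter encoded via several deep learning methods (an ANN autoencoder and a character-level LSTM-ANN sentiment classifier). The paper constructs a cross-disciplinary modelling approach spanning transport and data science. The approach improves incident duration prediction accuracy over the top-performing ML models applied to baseline incident reports. Results show that our proposed method improves accuracy by 60% compared with standard linear or support vector regression models, and by a further 7% relative to the hybrid deep learning auto-encoded GBDT model, which appears to outperform all other models. The application area is the city of San Francisco, which is rich in both traffic incident logs (Countrywide Traffic Accident Dataset) and past historical traffic congestion information (5-minute precision measurements from the Caltrans Performance Measurement System).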


A lightweight Transformer-based model for fish landmark detection

Alzayat Saleh , David Jones , Dean Jerry , Mostafa Rahimi Azghadi

Categories: Computer Vision

2022-09-13

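Transformer-based models, such as the Vision Transformer (ViT), can outperform Convolutional Neural Networks (CNNs) in some vision tasks when there is sufficient training data. However, CNNs have a strong and useful inductive bias for vision tasks (i.e., translation equivariance and locality). In this work, we develop a novel model architecture that we call the Mobile Fish Landmark Detection network (MFLD-net). We build this model using convolution operations based on ViT (i.e., patch embeddings and multi-layer perceptrons). MFLD-net can achieve competitive or better results while being lightweight, making it suitable for embedded and mobile devices. Furthermore, we show that MFLD-net can achieve keypoint (landmark) estimation accuracies on par with, or even better than, some state-of-the-art CNNs on a fish image dataset. Additionally, unlike ViT, MFLD-net does not need a pre-trained model and generalises well when trained on a small dataset. We provide quantitative and qualitative results that demonstrate the model's generalisation capabilities. This work will provide a foundation for future efforts to develop mobile yet efficient fish monitoring systems and devices.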
